[00:00.940 --> 00:05.820]  Our next talk epitomizes what this village is all about.
[00:06.000 --> 00:10.120]  To put it bluntly, packet hacking doesn't happen without packets.
[00:10.120 --> 00:12.740]  And also, PCAPs or it didn't happen.
[00:12.820 --> 00:16.160]  Chris Arbella and Pete Anderson have spent decades working with
[00:16.160 --> 00:18.000]  Fortune 500 NOCs and SOCs
[00:18.000 --> 00:23.270]  to implement advanced packet analysis solutions,
[00:23.540 --> 00:25.480]  better packet pipelines,
[00:25.480 --> 00:27.680]  and to get more from those packets.
[00:27.680 --> 00:30.000]  It is my pleasure to introduce to you
[00:30.000 --> 00:31.520]  Chris Arbella and Pete Anderson
[00:31.520 --> 00:36.560]  with their talk, Packet Acquisition, Building the Haystack.
[00:38.240 --> 00:40.140]  Take it away!
[00:40.340 --> 00:44.720]  Hi everybody, and welcome to Packet Acquisition, Building the Haystack.
[00:44.720 --> 00:46.680]  I'm Chris Arbella.
[00:47.200 --> 00:48.600]  And I'm Pete Anderson.
[00:48.600 --> 00:53.580]  And we've been designing large-scale enterprise packet acquisition
[00:53.580 --> 00:56.160]  capture solutions for a long time.
[00:56.160 --> 01:00.920]  Most of that is mostly Pete, if you can't tell.
[01:01.240 --> 01:03.200]  So what are we going to talk about today?
[01:03.220 --> 01:07.560]  So most of you have probably used Wireshark,
[01:07.560 --> 01:09.980]  you know what a packet capture file is.
[01:10.080 --> 01:12.120]  But when you look at a big environment,
[01:12.120 --> 01:14.960]  you might wonder, where does that data come from?
[01:14.960 --> 01:18.260]  How can I get it from all the different strategic points
[01:18.260 --> 01:19.400]  in a large environment?
[01:19.400 --> 01:20.760]  And that's what we're going to talk about today.
[01:20.760 --> 01:23.880]  How do you instrument your environment for packet capture?
[01:24.020 --> 01:25.780]  And there's various different topics.
[01:25.780 --> 01:27.280]  We'll start with local packet capture.
[01:27.280 --> 01:29.540]  Then we're going to talk about how do you scale it out?
[01:29.540 --> 01:31.540]  How do you cap-span your environment?
[01:31.920 --> 01:34.700]  And then what are some of the other considerations
[01:34.700 --> 01:36.380]  you have to deal with in the enterprise?
[01:36.380 --> 01:39.540]  And then at the end, we'll talk a little bit about cloud and containers,
[01:39.540 --> 01:42.220]  because those are some of the newer topics that we're all dealing with
[01:42.680 --> 01:44.780]  in the packet capture world today.
[01:47.480 --> 01:48.340]  All right.
[01:48.340 --> 01:55.300]  So before we dive in to talking about how we're going to build out
[01:55.300 --> 01:59.280]  packet capture infrastructure, I want to talk a little bit about
[01:59.280 --> 02:00.720]  what are packets good for?
[02:00.720 --> 02:02.640]  Why do we want to capture them in the first place?
[02:02.660 --> 02:03.840]  Why do we care?
[02:04.580 --> 02:06.440]  Exactly. Why should you care about this?
[02:06.440 --> 02:08.120]  Why do you want to build one of these out?
[02:08.140 --> 02:09.340]  So what are they good for?
[02:10.360 --> 02:12.520]  So some of you may have heard the expression,
[02:12.520 --> 02:13.860]  packets or it didn't happen.
[02:13.920 --> 02:16.360]  Basically, they are a ground truth.
[02:16.700 --> 02:20.220]  You see something in a packet trace, that's what happened.
[02:20.320 --> 02:22.380]  They're definitive information.
[02:22.380 --> 02:25.880]  And most attacks, exploits, breaches, etc.,
[02:25.880 --> 02:28.780]  are going to involve traffic moving over the network,
[02:28.780 --> 02:29.980]  packets going over the network.
[02:29.980 --> 02:34.000]  And something I always say is that packets can tell you something about everything.
[02:37.630 --> 02:39.950]  So I'm not going to read through all of these,
[02:39.950 --> 02:42.010]  but there's a lot of different use cases.
[02:42.010 --> 02:45.570]  And I always like to use the analogy of logs.
[02:45.830 --> 02:50.330]  So a lot of you have probably dealt with log files and log aggregations.
[02:50.330 --> 02:56.030]  Very similar to packets in that they're good for all different types of things.
[02:56.030 --> 02:58.790]  They can answer all different types of questions.
[02:59.210 --> 03:04.050]  And actually, some of the challenges and trade-offs are pretty similar as well.
[03:05.890 --> 03:09.570]  So there's a bunch of different use cases for packets,
[03:09.650 --> 03:11.790]  a bunch of different types of packet analysis.
[03:11.790 --> 03:16.910]  The one that most of you are probably familiar with is offline packet analysis.
[03:16.910 --> 03:19.570]  So when someone gives you a PCAP or you've gathered a PCAP,
[03:19.570 --> 03:25.290]  and you use a tool like Wireshark to go and parse through that PCAP file.
[03:25.570 --> 03:29.390]  But there's also real-time analysis solutions.
[03:29.430 --> 03:32.130]  Things that do shallow packet inspection,
[03:32.130 --> 03:35.490]  where you're basically looking at the TCP/IP 5-tuple,
[03:35.490 --> 03:38.530]  what ports are talking, what IP addresses are talking.
[03:38.690 --> 03:42.570]  And honestly, if you're doing that, just use NetFlow.
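To make the 5-tuple concrete, here is a minimal Python sketch of shallow inspection on one raw frame. It assumes an untagged Ethernet frame carrying IPv4 with TCP or UDP; the `FiveTuple` and `five_tuple` names are illustrative, not something shown in the talk.

```python
import socket
import struct
from typing import NamedTuple, Optional


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


def five_tuple(frame: bytes) -> Optional[FiveTuple]:
    """Return the 5-tuple of an untagged Ethernet/IPv4 TCP or UDP frame, else None."""
    if len(frame) < 14 + 20:  # Ethernet header plus minimal IPv4 header
        return None
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != 0x0800:  # this sketch handles IPv4 only (no VLAN tags, no IPv6)
        return None
    ihl = (frame[14] & 0x0F) * 4  # IPv4 header length in bytes
    protocol = frame[14 + 9]
    if protocol not in (6, 17):  # 6 = TCP, 17 = UDP
        return None
    l4 = 14 + ihl  # offset of the transport header
    if len(frame) < l4 + 4:
        return None
    src_ip = socket.inet_ntoa(frame[26:30])
    dst_ip = socket.inet_ntoa(frame[30:34])
    src_port, dst_port = struct.unpack("!HH", frame[l4:l4 + 4])
    return FiveTuple(src_ip, dst_ip, protocol, src_port, dst_port)
```

Nothing past the first four bytes of the transport header is read, which is why this kind of analysis carries roughly the same information as a flow record.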
[03:42.570 --> 03:46.930]  But then it gets more interesting once you start getting into deep packet inspection,
[03:46.930 --> 03:49.870]  where you're actually looking at the application payload.
[03:49.870 --> 03:52.630]  You're looking at L5, L6, L7.
[03:53.350 --> 03:57.070]  And especially when you start becoming session aware.
[03:57.110 --> 04:00.330]  So if you've used Snort, Suricata, Zeek,
[04:00.330 --> 04:04.870]  once you have something that can tell you when a connection has actually been established.
[04:05.030 --> 04:08.710]  And even getting to the point where you can do full reassembly,
[04:08.710 --> 04:15.550]  as in the case of doing file extraction from, whether it's HTTP or SMB, pick your poison.
[04:15.970 --> 04:21.170]  Kind of the flip to that offline packet analysis is continuous packet capture,
[04:21.170 --> 04:24.290]  where you're just writing packets to disk as fast as you can.
[04:24.290 --> 04:30.150]  You have a rolling buffer, and you just keep filling those disks with raw packets.
[04:30.150 --> 04:35.050]  A good example of this is Google's open source Stenographer,
[04:35.050 --> 04:38.570]  which again is meant for high speed packets to disk.
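The rolling-buffer idea can be sketched in a few lines of Python. This is only an illustration of the eviction logic under a fixed byte budget, not a description of how Stenographer is implemented; the class and method names are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


class RollingCaptureBuffer:
    """Track capture files and evict the oldest once a byte budget is exceeded."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.total_bytes = 0
        self.files: Deque[Tuple[str, int]] = deque()  # (name, size), oldest first

    def add_file(self, name: str, size: int) -> List[str]:
        """Register a finished capture file; return the names that should be deleted."""
        self.files.append((name, size))
        self.total_bytes += size
        evicted: List[str] = []
        # Always keep the newest file, even if it alone exceeds the budget.
        while self.total_bytes > self.max_bytes and len(self.files) > 1:
            old_name, old_size = self.files.popleft()
            self.total_bytes -= old_size
            evicted.append(old_name)
        return evicted
```

The retention window you get out of a buffer like this is simply the disk budget divided by the average capture rate, so the faster the links, the shorter the history.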
[04:41.070 --> 04:44.830]  So a little bit about the basics. Where do we start?
[04:44.890 --> 04:47.950]  We kind of already mentioned some of these technologies,
[04:47.950 --> 04:52.590]  but when you're just grabbing packets from a single host, from your local host,
[04:52.590 --> 05:00.030]  most commonly you're going to use something like tcpdump, tshark, ngrep, or Wireshark,
[05:00.030 --> 05:01.550]  if you're actually using a GUI.
[05:01.550 --> 05:06.610]  There are some other ways of getting packets off of a host,
[05:06.610 --> 05:12.130]  rpcapd, or remote PCAP, it's a part of PCAP, I believe.
[05:12.130 --> 05:16.570]  But it actually lets you take packets from a local interface,
[05:16.570 --> 05:21.030]  and then direct them to some remote destination.
[05:21.610 --> 05:27.490]  These are good for one-offs, troubleshooting, forensics, when you have a host isolated.
[05:27.490 --> 05:32.830]  But this doesn't really scale to an actual enterprise deployment.
[05:36.770 --> 05:41.950]  Alright, so speaking of scale, how do we scale out?
[05:41.950 --> 05:47.650]  And the thing we're going to talk about first is identifying critical networks.
[05:47.650 --> 05:53.730]  So before you start implementing anything, by far the most important step
[05:53.730 --> 05:57.830]  is to identify what you care about and what you want to capture.
[05:57.830 --> 06:00.730]  Enterprise networks are very complicated.
[06:01.310 --> 06:03.970]  If you remember, I used the log analogy before.
[06:03.970 --> 06:06.670]  You're never going to capture every packet. It's impossible.
[06:06.670 --> 06:11.570]  It's just like you're never going to write every single log out on every host in your environment,
[06:11.570 --> 06:15.770]  and then ship all that over somewhere and store it. Very similar.
[06:15.770 --> 06:18.750]  So we have to target key network segments.
[06:18.750 --> 06:22.070]  And the questions to think about are, where are your critical assets?
[06:22.630 --> 06:26.490]  Where is your critical data stored? And how does that get accessed?
[06:27.070 --> 06:33.630]  And you want to focus on those network paths and capture traffic there.
[06:33.630 --> 06:38.030]  Because if you have a breach, if you have an attacker, if you have lateral movement,
[06:38.030 --> 06:42.310]  that's the stuff they're trying to get to, and that's where you really want to focus your attention.
[06:44.790 --> 06:51.590]  Some common examples of those critical assets, things in our experience,
[06:51.590 --> 06:58.470]  Active Directory or any authentication service, both internal within your network for your users,
[06:58.470 --> 07:02.270]  as well as for your customers. So if you run a website,
[07:02.270 --> 07:06.950]  making sure that you actually are monitoring the authentication services for that website.
[07:06.950 --> 07:13.510]  Git repos, and actually tied to Git repos, anything tied to the CI/CD pipeline.
[07:13.650 --> 07:20.590]  There were some exploits around publicly exposed Jenkins endpoints, as an example.
[07:20.590 --> 07:27.610]  File shares, databases and data lakes, any sort of PCI network, even mainframes.
[07:29.450 --> 07:35.430]  Basically, does sensitive data live there? Is it something that is critical to your business making money?
[07:35.610 --> 07:40.910]  It is probably a critical network.
[07:40.910 --> 07:44.370]  So after you've picked out what your critical networks are,
[07:44.670 --> 07:48.890]  we have some prioritization on where we want to get the data from.
[07:48.890 --> 07:53.370]  How do we actually start gathering data from those networks?
[07:53.370 --> 07:56.830]  And there's really two common ways of doing it.
[07:56.830 --> 08:02.870]  The first is taps. The second is spans. A tap is actually purpose-built hardware.
[08:02.870 --> 08:09.430]  It's going to physically break the link between two different pieces of network infrastructure
[08:09.430 --> 08:14.350]  and take a passive copy from that individual link.
[08:14.350 --> 08:21.510]  So again, it's tied to everything. You'll need one tap for every link that you're trying to get packets from.
[08:21.810 --> 08:27.010]  On the span side, it's a configuration on network infrastructure, typically a switch.
[08:27.010 --> 08:30.130]  Switched Port Analyzer is what SPAN stands for.
[08:30.130 --> 08:35.730]  You'll also hear mirroring or monitor sessions. Those are all the same thing.
[08:36.470 --> 08:46.690]  But it's taking a digital copy from the data plane of the network infrastructure and directing that copy to another location.
[08:48.050 --> 08:55.070]  So taps, they're less common than spans, and that's mostly because of the cons of taps.
[08:55.070 --> 09:03.110]  So I mentioned it's actually tied to individual links, which means that for every link you want to monitor,
[09:03.110 --> 09:08.010]  you actually need another tap. So you end up having to pay for many, many different taps,
[09:08.010 --> 09:15.270]  which also has an administrative burden, both from the outage required when you break that wire,
[09:16.050 --> 09:21.210]  as well as just how many individual taps that you have and you have to cable.
[09:21.470 --> 09:30.190]  On the other hand, because it's tied to the individual link, because it's actually a physical tap into that data feed,
[09:30.190 --> 09:38.370]  you end up with the best possible data feed. You can actually see on the screen how a fiber tap works,
[09:38.370 --> 09:47.730]  where you're actually slightly bending a cable and using refraction through that cable to pull out some percentage of the light passing over that fiber.
[09:49.670 --> 09:58.970]  So additionally, because it's tied to the physical infrastructure, there's no impact on any sort of switches, any sort of routers, anything like that.
[09:58.970 --> 10:05.710]  Yeah, so taps don't cause packet loss and they preserve packet order and timing perfectly.
[10:08.490 --> 10:14.690]  So where should you tap? There's typically a couple of different places that we see it.
[10:14.690 --> 10:21.130]  One is transit links, so north-south boundary into and out of a data center, for example.
[10:21.130 --> 10:33.350]  You'll also often see it between link layers or links between layers in a traditional three-tier network architecture, whether that's core to distribution or distribution to access.
[10:33.890 --> 10:42.970]  And in spine-leaf networks, you're actually going to most commonly see it on the link between the spines and the leaves themselves.
[10:42.970 --> 10:49.510]  And the reason for these different points of tapping, they give you the most bang for your buck.
[10:49.510 --> 10:54.990]  You'll have the most complete data set for critical network junctures.
[10:59.420 --> 11:05.820]  So spans. Spans are what you're going to run into most of the time, and they're what you're going to use most of the time.
[11:05.820 --> 11:09.220]  Taps are actually great, and in a perfect world, you would tap stuff all over the place.
[11:09.220 --> 11:14.860]  The problem is they're really expensive, and Chris mentioned some of the other cons.
[11:15.100 --> 11:20.540]  So we're going to use spans. Now, why are spans so great? Because guess what? You've already got them.
[11:20.540 --> 11:28.100]  So the vast majority of managed switches support at least some type of span or port mirroring technology.
[11:28.520 --> 11:33.420]  They are, indeed, inexpensive and ubiquitous, but it's extremely flexible, too.
[11:33.420 --> 11:46.560]  So when you think about a tap, you're tapping a link. With spans, the basic idea is you're spanning a port, but there are a lot of more advanced configurations around VLAN, filtering, and other things along those lines.
[11:46.560 --> 11:50.360]  And you can target traffic very specifically with a span.
[11:50.820 --> 11:56.200]  So, sounds like spans are great. Why don't we just always use them? So there are a couple of downsides.
[11:56.780 --> 12:10.700]  The data can be lower fidelity. You really have to watch out for over-subscribing your destination and having what I sometimes call observation drops, which is essentially a packet loss on the span destination.
[12:12.100 --> 12:20.760]  Occasionally, you'll see some negative impact on switch performance. It's pretty rare. It can happen. It's less common than it used to be.
[12:21.100 --> 12:31.960]  Now that switch hardware has gotten more powerful, but every once in a while, you'll have a switch vendor make an implementation change, which I'm sure some of you are familiar with, that can cause issues.
[12:31.960 --> 12:41.300]  But generally, that's pretty rare. And overall, this is the approach that you're going to use the majority of the time, at least in most environments.
[12:41.300 --> 12:49.320]  I'd say 98% of the environments that I work in are all based off of spans versus taps.
[12:53.640 --> 13:01.220]  So let's talk a little bit about the types of spans. So you may hear these different terms. Span, RSpan, ERspan. What do they all mean?
[13:01.360 --> 13:07.460]  So a span, or a port mirror, that is going to be the most common span type, and that is going to be local to a single switch.
[13:07.460 --> 13:19.640]  So a simple example. I want to span all the traffic going in and out of port 1, and I'm going to send it all to port 2, and I'm going to put my packet analysis, packet capture solution on port 2.
[13:21.000 --> 13:31.000]  That's going to be your most common span type. RSpan, you're not going to run into very much, at least in my experience. I've been doing this for almost 20 years. I think I've run into RSpan like twice.
[13:31.200 --> 13:36.640]  And both times it didn't work that well, so that's probably why I never run into it.
[13:36.640 --> 13:38.220]  I've never had it.
[13:40.180 --> 13:49.780]  Exactly. So basically, an RSpan is your source is on one switch, your destination can be on the other, and you use an L2 transport to get there.
[13:50.980 --> 13:59.760]  We're going to talk about this later, but generally when you have a situation like that, an aggregation solution is much more preferable, and I think that's one of the big reasons we don't see RSpan.
[14:01.120 --> 14:10.380]  ERspan is a little bit more common. It's somewhat similar to an RSpan, except that instead of a Layer 2 transport, it uses a Layer 3 transport.
[14:10.380 --> 14:20.360]  So you can span stuff on one switch and send it to another, and it does that by encapsulating the spanned packets in a Layer 3 GRE tunnel.
[14:20.360 --> 14:41.420]  In a traditional switch network, it's somewhat uncommon, but when you get into the VMware world, it is more common because the VMware distributed switch supports an ERspan, so you can essentially span virtual ports on that virtual distributed switch and then send it somewhere else.
[14:41.420 --> 15:05.180]  It's also fairly common in spine-leaf networks, although that is a little bit different because although it's technically an ERspan, it doesn't leave the... you're not always going over the data network, it's usually using that underlay network that a lot of the spine-leaf implementations use, so it doesn't leave it that way.
[15:06.100 --> 15:20.700]  The big caveat with ERspan is that it can put a lot of traffic on your network, right? If you're spanning everything in and out of a port and then you're encapsulating it and sending it over your IP network, you're literally doubling whatever that traffic is, now it's on your network twice.
[15:21.020 --> 15:31.560]  The other thing you have to keep in mind is there is some encapsulation overhead, so you've got to make sure jumbo frames are supported everywhere, because otherwise packets are going to be too big.
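A rough worked example of why a standard MTU is not enough, written as Python arithmetic. The header sizes are assumptions for ERSPAN Type II over IPv4 (outer IPv4 header, GRE header with sequence number, ERSPAN header); check the documentation for your platform, since other ERSPAN types add more.

```python
# Header sizes in bytes. These are assumptions for ERSPAN Type II over IPv4;
# other ERSPAN versions and IPv6 transport use different sizes.
OUTER_IPV4 = 20
GRE_WITH_SEQUENCE = 8
ERSPAN_TYPE_II = 8
ETHERNET_HEADER = 14


def erspan_ip_packet_size(inner_ip_mtu: int = 1500) -> int:
    """Size of the outer IP packet that carries one full-size mirrored frame."""
    # The whole layer 2 frame is copied, so its Ethernet header rides inside the tunnel.
    mirrored_frame = ETHERNET_HEADER + inner_ip_mtu
    return OUTER_IPV4 + GRE_WITH_SEQUENCE + ERSPAN_TYPE_II + mirrored_frame


def fits(transport_ip_mtu: int, inner_ip_mtu: int = 1500) -> bool:
    """True if a full-size mirrored frame fits the transport network's IP MTU."""
    return erspan_ip_packet_size(inner_ip_mtu) <= transport_ip_mtu
```

Under these assumptions a full-size 1500-byte packet becomes a 1550-byte tunnel packet, so `fits(1500)` is false while `fits(9000)` is true.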
[15:32.820 --> 15:50.700]  I'll say I've seen, especially in VMware environments, sometimes you'll actually have dedicated monitoring interfaces, physical interfaces on the physical hosts that are dedicated for ERspan. That way you're avoiding putting monitor traffic onto the data networks.
[15:50.700 --> 16:04.240]  Actually, Chris brings up a great point there. If I was designing an environment from scratch, I would absolutely make it a requirement to have a second 10 gig NIC in every single virtual host and have that connect to an out-of-band management network.
[16:04.240 --> 16:09.380]  Then you could use ERspan to your heart's content without impacting your data network at all.
[16:12.430 --> 16:27.050]  So this is a quick example, or a couple of examples, of how you might configure a SPAN session. And you can see a link to a documentation down below. This is Cisco specific. Most vendors are fairly similar though.
[16:29.070 --> 16:35.930]  I will tell you, I've been doing this a long time, there are always slight little differences in the different software versions.
[16:35.930 --> 16:49.790]  So it's always a good idea to Google, hey, I've got this particular switch with this version of IOS. If we're talking Cisco, obviously, if you have another vendor, do whatever the similar search would be and make sure you have the right syntax.
[16:50.530 --> 16:56.030]  Cisco likes, at least, to change their syntax quite a bit from version to version.
[16:56.030 --> 17:08.090]  It's funny, as I said, I've been doing this almost 20 years, and it's always looked pretty much like this, but the way you arrange your VLANs has changed over the years and things of that nature.
[17:08.670 --> 17:22.650]  By the way, so one of the things we're going to be talking about here is you'll notice that in one case we have an example of spanning a port, in another one we have an example of spanning a VLAN, and then you'll notice how you can send the traffic to a destination.
[17:22.650 --> 17:32.710]  So a lot of times, by the way, you're not just limited to one destination, you can do multiple destinations, and I'll talk about this a little more, but virtual interfaces are often supported.
[17:33.030 --> 17:36.830]  You'll notice there's some directionality there as well, which is something else we'll talk about.
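Since the slide itself is not in the transcript, here is a small Python sketch that prints the general shape of a local SPAN configuration. The `monitor session` lines follow the classic Cisco Catalyst IOS form, but as noted above the exact syntax varies by platform and version, so treat the output as something to verify and not paste. The function name and the interface names in the usage note are hypothetical.

```python
from typing import List, Sequence


def span_session(session: int, destination: str, *,
                 interfaces: Sequence[str] = (),
                 vlans: Sequence[int] = (),
                 direction: str = "both") -> List[str]:
    """Build Catalyst-IOS-style 'monitor session' lines for a local SPAN."""
    if direction not in ("rx", "tx", "both"):
        raise ValueError("direction must be 'rx', 'tx' or 'both'")
    # Assumption: like many Catalyst platforms, one session takes ports or VLANs, not both.
    if bool(interfaces) == bool(vlans):
        raise ValueError("give either interfaces or vlans as sources")
    lines: List[str] = []
    for name in interfaces:
        lines.append(f"monitor session {session} source interface {name} {direction}")
    for vlan in vlans:
        lines.append(f"monitor session {session} source vlan {vlan} {direction}")
    lines.append(f"monitor session {session} destination interface {destination}")
    return lines
```

For example, `span_session(1, "GigabitEthernet1/0/2", interfaces=["GigabitEthernet1/0/1"])` yields a port span in both directions, and `span_session(2, "TenGigabitEthernet1/1/1", vlans=[10, 20], direction="rx")` yields an ingress-only VLAN span.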
[17:39.590 --> 17:51.250]  Alright, so, as I said, we were going to talk about some of the span sources. So, at its most basic, right, a span is going to be just spanning a port or multiple ports.
[17:52.290 --> 17:57.090]  And when you think of an interface, right, you might think, hey, yes, it's just an Ethernet interface on a switch.
[17:57.090 --> 18:08.310]  But a lot of switch vendors also support things like virtual interfaces and aggregate interfaces, so in the Cisco world, like a port channel, you can span a port channel, essentially.
[18:08.450 --> 18:18.650]  And so what's the use case of spanning a particular port? It's usually in a large environment to target a particular network path or link.
[18:18.650 --> 18:28.670]  Now, if you have a very small environment where you have, you know, like one switch, like I have one Cisco switch in my home environment, right, so I just span every port on the switch.
[18:29.470 --> 18:38.890]  That is going to be less common in an enterprise environment, but if you do have a small environment, it's a reasonable approach to just span all the ports.
[18:40.170 --> 18:49.850]  However, when you get to the enterprise, it's much more common to use a VLAN span, right, because most modern networks, things are segmented by VLAN.
[18:52.050 --> 18:57.930]  So your source is going to be a VLAN or multiple VLANs, right? Typically, when I do this, it's multiple VLANs.
[18:57.930 --> 19:03.350]  I'll go in and we'll say, hey, where are all the critical servers? Let's span all those VLANs, and that's a lot of times where we start.
[19:03.350 --> 19:10.390]  How often do people actually know what lives on which VLANs?
[19:11.430 --> 19:14.550]  Well, that varies, of course.
[19:14.670 --> 19:29.770]  It's usually, in my case, a discovery where you'll start by casting your net wide and capture many, many VLANs and then prune off as you validate what traffic lives where.
[19:30.550 --> 19:39.750]  And I've actually done the opposite approach too, you know, we'll start and we'll say, OK, we're going to start with these VLANs and then inevitably you'll go look for traffic and it's not there.
[19:39.950 --> 19:45.190]  And then, you know, you add some more. So Chris brings up a good point, right? I mean, there's always going to be some tweaking.
[19:45.190 --> 19:49.730]  It's not like, hey, you have to have everything set in stone on day one.
[19:50.870 --> 19:57.130]  You'll want to take your best shot at it, what you think you need, and then refine your data from there.
[19:58.230 --> 20:05.930]  So a couple more notes on this. With port spans, by the way, typically you're going to span in both directions, so ingress and egress.
[20:06.090 --> 20:11.650]  With VLANs, you only need to span ingress. And in fact, some newer switches only support ingress.
[20:11.650 --> 20:14.850]  So you won't even have the choice anyway. So you don't even have to worry about it.
[20:15.170 --> 20:18.050]  And then the last thing I'll say, a couple of last things on VLANs.
[20:18.050 --> 20:23.070]  So they're a really good way to target certain assets, which is the reason they're the most common method.
[20:23.070 --> 20:28.230]  And then when we get into spine-leaf, there are some similar concepts in there.
[20:28.370 --> 20:32.190]  They just have different names because, of course, we want to confuse everyone.
[20:35.760 --> 20:38.300]  So what are some common span issues?
[20:38.460 --> 20:43.160]  So number one is probably oversubscribed monitor ports.
[20:43.160 --> 20:48.340]  And I'll give you the simple explanation of this. Believe it or not, you can have a single...
[20:48.340 --> 20:55.600]  if you span a single 1 gig interface in both directions, you can completely overrun a destination 1 gig interface.
[20:56.060 --> 20:58.080]  And you say, Pete, how is that possible?
[20:58.080 --> 21:03.220]  Well, think of it this way, an interface sends traffic in both directions, right?
[21:03.220 --> 21:08.560]  So you're not... a 1 gig interface can actually have a total of 2 gigabits per second of traffic, right?
[21:08.560 --> 21:10.340]  If you combine the ingress and the egress.
[21:10.340 --> 21:18.740]  However, your destination interface that's sending the traffic to your analysis solution, that can only send at 1 gigabit per second, right?
[21:18.760 --> 21:23.020]  Now, in reality, is spanning a 1 gig port going to overrun another 1 gig port?
[21:23.020 --> 21:25.180]  No, almost never, right?
[21:25.180 --> 21:30.380]  So it's very rare for switch ports to consistently run maxed out.
[21:30.820 --> 21:33.400]  But it is something to be aware of, right?
[21:33.440 --> 21:38.200]  And this is where we're going to get into, you know, having multiple destinations and aggregation.
[21:38.440 --> 21:43.340]  In an enterprise environment, your destination port should almost always be at least 10 gigs.
[21:43.840 --> 21:47.120]  Because you're going to... just trust me, you'll run into way less problems.
[21:47.120 --> 21:55.880]  Trying to span out 1 gig interfaces in most modern networks, unless you're in a really small environment, is... you know, you're looking for trouble there.
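The oversubscription arithmetic above can be written down as a short Python sketch. It computes the worst case only, assuming every source runs at line rate at once, which as noted is rare in practice; the function names are illustrative.

```python
from typing import Iterable, Tuple


def worst_case_span_gbps(sources: Iterable[Tuple[float, str]]) -> float:
    """Sum the peak mirrored load; each source is (link speed in Gbit/s, direction)."""
    total = 0.0
    for speed_gbps, direction in sources:
        if direction not in ("rx", "tx", "both"):
            raise ValueError("direction must be 'rx', 'tx' or 'both'")
        # A full-duplex link mirrored in both directions can offer twice its line rate.
        total += speed_gbps * (2 if direction == "both" else 1)
    return total


def oversubscription_ratio(sources: Iterable[Tuple[float, str]],
                           destination_gbps: float) -> float:
    """Values above 1.0 mean the destination port can drop mirrored packets."""
    return worst_case_span_gbps(sources) / destination_gbps
```

A single 1 gig port spanned in both directions to a 1 gig destination gives a ratio of 2.0; the same source sent to a 10 gig destination gives 0.2, which is the headroom argument for 10 gig monitor ports.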
[21:56.120 --> 22:03.740]  I mentioned this on the previous slide, but when you have an interface span, typically, you're going to want to do both directions.
[22:03.740 --> 22:06.280]  If you don't, you're going to get unidirectional traffic.
[22:06.660 --> 22:10.220]  And you'll be really confused because your packet trace won't make any sense.
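One way to spot this symptom in a capture is to look for conversations that only ever appear in one direction. The Python sketch below assumes the packets have already been reduced to address and port tuples; the type aliases and function name are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Endpoint = Tuple[str, int]           # (ip, port)
Packet = Tuple[str, int, str, int]   # (src_ip, src_port, dst_ip, dst_port)


def one_way_conversations(packets: Iterable[Packet]) -> List[Tuple[Endpoint, Endpoint]]:
    """Return (source, destination) pairs for which no reply direction was captured."""
    seen: Set[Tuple[Endpoint, Endpoint]] = set()
    for src_ip, src_port, dst_ip, dst_port in packets:
        seen.add(((src_ip, src_port), (dst_ip, dst_port)))
    # A pair is one-way if its reversed pair never shows up in the capture.
    return sorted(pair for pair in seen if (pair[1], pair[0]) not in seen)
```

Some traffic is legitimately one-way (syslog over UDP, multicast, scans that get no answer), so a handful of hits is normal. If nearly every TCP conversation shows up here, a receive-only or transmit-only span is a likely cause.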
[22:11.540 --> 22:12.820]  And we do have a note here.
[22:12.820 --> 22:17.960]  If you go back to my simple, like, super small network example, if you're spanning all the interfaces on a switch,
[22:17.960 --> 22:22.580]  in that case, you could do... you could just do receive.
[22:23.540 --> 22:25.820]  But most of the time, you're going to do both.
[22:25.900 --> 22:30.240]  And then I mentioned this before, but another one is ERspan without jumbo frame support.
[22:30.540 --> 22:35.640]  So I know some implementations of ERspan actually mark the packets "do not fragment."
[22:35.640 --> 22:42.260]  So that means if you try to send a full-size packet that's been encapsulated over a network without jumbo frame support,
[22:42.260 --> 22:44.940]  well, it's not going to go anywhere. It can't be fragmented.
[22:53.130 --> 22:54.850]  So where do we span?
[22:55.510 --> 23:01.270]  So in a three-tier architecture, when I say three-tier architecture, this is your traditional core distribution access.
[23:01.630 --> 23:04.190]  Those are typically going to be your three options.
[23:05.090 --> 23:06.790]  So we'll start with the core.
[23:09.470 --> 23:12.750]  I sometimes work with people that span at the core.
[23:13.390 --> 23:14.630]  Depends on the network.
[23:14.990 --> 23:17.150]  I'll get to where we usually span.
[23:17.150 --> 23:21.590]  But when you look at the core, your north-south traffic is going to be there a lot of times.
[23:21.590 --> 23:22.910]  So you can get all your north-south traffic.
[23:22.910 --> 23:25.370]  But your east-west traffic can be pretty limited.
[23:26.190 --> 23:31.630]  Sometimes some networks will really only get routed traffic and things like that.
[23:32.230 --> 23:37.470]  The advantage of spanning at the core is it does not require a lot of cabling or port density.
[23:38.590 --> 23:43.630]  Some networks use more of a collapsed core where a lot of traffic goes to the core.
[23:43.630 --> 23:46.530]  In those cases, it can be good to span at the core.
[23:46.530 --> 23:50.290]  But generally, I don't recommend only spanning at the core.
[23:50.290 --> 23:52.510]  You're just going to miss too much east-west traffic.
[23:53.670 --> 23:59.090]  Usually, the keyword I look for is if a network architect describes their network as flat,
[23:59.950 --> 24:05.510]  then you probably can span at the core and get a decent amount of the traffic.
[24:05.510 --> 24:08.870]  But again, as Pete mentioned, you're still going to be missing a significant amount.
[24:13.360 --> 24:17.560]  So distribution. This, as we say, is often the sweet spot.
[24:18.160 --> 24:22.100]  Typically, you're still going to get... so we mentioned north-south on the last slide.
[24:22.500 --> 24:25.720]  A lot of times you get north-south anyway because you move down the stack.
[24:25.720 --> 24:29.900]  So you may have to do a little supplementing of certain transit networks,
[24:29.900 --> 24:33.240]  or maybe you grab lots of DMZ traffic that's up closer to the core,
[24:33.240 --> 24:35.120]  but most of your north-south,
[24:35.120 --> 24:38.600]  it's going to stop all the way at the bottom of this diagram anyways.
[24:38.600 --> 24:42.860]  So you're going to get it regardless of the layer you span at.
[24:42.860 --> 24:46.080]  And what's the advantage of the distribution over the core?
[24:46.080 --> 24:48.440]  More east-west traffic, period.
[24:48.780 --> 24:54.200]  You're still not going to get all of it because you won't get traffic that never leaves an access switch.
[24:54.200 --> 24:57.180]  So if you have two servers that talk to each other on the same access switch,
[24:57.180 --> 24:59.520]  it's not going to come up in the distribution layer.
[25:00.500 --> 25:07.300]  But you'll get a lot of stuff, and it doesn't have incredibly onerous cabling requirements.
[25:07.300 --> 25:15.160]  So even in large networks, a lot of times we're only talking 8, 12, 16 distribution switches.
[25:15.640 --> 25:19.960]  So it's not like you're going to be running dozens and dozens and dozens of cables,
[25:19.960 --> 25:24.220]  or requiring massive aggregation switches.
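The coverage trade-off between tiers can be modeled with a toy Python sketch. It assumes a strictly hierarchical three-tier network in which traffic only climbs as high as it must to reach the other host, and it ignores routing boundaries and redundancy; the topology format and all names are hypothetical.

```python
from typing import Dict, Tuple

# host -> (access switch, distribution switch); every name here is hypothetical
Topology = Dict[str, Tuple[str, str]]

TIER_ORDER = ["access", "distribution", "core"]


def highest_tier_reached(topology: Topology, a: str, b: str) -> str:
    """Highest tier a flow between hosts a and b has to traverse."""
    access_a, dist_a = topology[a]
    access_b, dist_b = topology[b]
    if access_a == access_b:
        return "access"        # never leaves the shared access switch
    if dist_a == dist_b:
        return "distribution"  # turns around at the shared distribution switch
    return "core"


def visible_at(span_tier: str, topology: Topology, a: str, b: str) -> bool:
    """A span at span_tier sees the flow only if the flow climbs at least that high."""
    reached = TIER_ORDER.index(highest_tier_reached(topology, a, b))
    return reached >= TIER_ORDER.index(span_tier)
```

In this model a distribution span misses two servers on the same access switch, a core span also misses anything that turns around at a distribution switch, and an access span sees all three cases.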
[25:28.340 --> 25:29.980]  Finally, the access layer.
[25:29.980 --> 25:35.280]  Now, like I mentioned in a perfect world with taps, this is another perfect world situation.
[25:35.280 --> 25:39.360]  If I could just wave a wand, I would just span at the access layer.
[25:39.360 --> 25:41.460]  Because you're going to get the most traffic.
[25:43.200 --> 25:46.360]  Because you'll get almost all your east-west traffic there.
[25:46.360 --> 25:48.260]  Even hosts on the same switch.
[25:48.520 --> 25:53.880]  The problem is it can require extensive cabling.
[25:53.880 --> 25:59.260]  I mean, we have customers with dozens, hundreds of access switches.
[26:00.760 --> 26:05.380]  And that's a lot of labor and work to cable all that up.
[26:06.180 --> 26:07.540]  You say getting crazy?
[26:07.660 --> 26:11.740]  I do know some organizations who span all their top-of-rack switches though.
[26:11.740 --> 26:13.160]  And they get great coverage.
[26:13.260 --> 26:14.840]  So you have to look at your environment, right?
[26:14.840 --> 26:19.980]  It depends how sprawling it is, how many access switches you have.
[26:20.180 --> 26:23.840]  Sometimes what it might make sense to do is not do all the access switches,
[26:23.840 --> 26:27.400]  but if you have certain critical ones where there's certain assets, then you can look at those.
[26:27.400 --> 26:35.720]  I think that just gets back to going into your packet acquisition strategy design with a plan.
[26:35.760 --> 26:37.460]  Know what you want to get at.
[26:37.460 --> 26:40.740]  And even if you don't necessarily know where it lives, if you know what you want,
[26:40.740 --> 26:45.140]  you can figure it out over time as you do some discovery.
[26:49.720 --> 26:52.600]  All right, so we talked about three-tier.
[26:52.600 --> 26:54.600]  Now we're going to talk a little bit about spine-leaf.
[26:54.880 --> 26:57.960]  This example is going to be very ACI-specific.
[26:58.220 --> 27:00.220]  Just because that's what I'm most familiar with.
[27:01.620 --> 27:04.640]  But other vendors are pretty similar.
[27:05.140 --> 27:11.680]  The most confusing thing about spine-leaf is they decided to use all different terminology to talk about everything.
[27:12.580 --> 27:20.480]  But in some ways, when you start digging into it, it actually can be a little easier than traditional three-tier networks.
[27:20.580 --> 27:23.340]  Number one, there's not three tiers anymore. There's only two, right?
[27:23.340 --> 27:25.560]  There's Spine switches and Leaf switches.
[27:25.980 --> 27:29.680]  And specifically in ACI networks, there's three span types.
[27:29.680 --> 27:35.080]  There's an access span. That is very similar to a traditional span.
[27:35.080 --> 27:37.700]  So an access span is on the Leaf switch.
[27:37.760 --> 27:40.020]  And Leaf switches essentially have two different types of ports.
[27:40.020 --> 27:44.380]  They have access ports, which is traditional switch ports. You hang a server off it.
[27:44.380 --> 27:47.320]  And they have uplink ports, or fabric ports.
[27:47.320 --> 27:49.060]  And that's what talks to the Spine switches.
[27:49.060 --> 27:52.640]  So an access span would be spanning only access ports.
[27:52.820 --> 27:58.840]  And for those, your destination can be local on that switch, or it can be remote using ERSPAN.
[27:58.840 --> 28:10.780]  Now, as I mentioned before, while this is technically ERSPAN, it's not exactly the same as, you know, just ERSPAN in a vacuum.
[28:10.840 --> 28:15.640]  Because you're doing ERSPAN, but you're going somewhere else on the fabric.
[28:15.640 --> 28:21.720]  And the fabric already has its own transport IP network as part of it.
[28:22.900 --> 28:29.920]  So you don't have to worry about as many of the downsides of ERSPAN when you're dealing with spine-leaf.
[28:30.280 --> 28:32.040]  That's the high level summary.
[28:32.360 --> 28:36.400]  It's much less likely you're going to, you know, stomp on your other traffic.
[28:36.800 --> 28:40.720]  And, you know, modern spine-leaf networks, they're all going to support jumbo frames.
[28:42.220 --> 28:45.200]  So the second type of span is a fabric span.
[28:45.320 --> 28:51.500]  That's going to be spanning the uplink port between the Leaf and the Spine.
[28:52.300 --> 28:58.540]  One thing to note here is when you go to look at the packets, if you wanted to, like, pull them into Wireshark, they're going to be VXLAN encapsulated.
[28:58.540 --> 29:03.260]  So you're going to have to decapsulate them before you analyze them.
[29:03.440 --> 29:16.260]  And actually, if you go beyond Wireshark and you're looking at this kind of traffic, your aggregation solution or your analysis solution would also need to be able to decapsulate that traffic.
[29:17.260 --> 29:19.240]  This one always uses ERSPAN.
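The decapsulation step mentioned here can be sketched in a few lines of Python. This is a minimal sketch, not what any particular aggregation product does: it assumes an untagged outer Ethernet header, an outer IPv4 header, and the standard VXLAN header layout. The function name and the default port (4789, the IANA-assigned VXLAN port) are assumptions; a given fabric may use a different port or a vendor header variant, which is why the port is a parameter.

```python
import struct


def decapsulate_vxlan(frame: bytes, vxlan_port: int = 4789):
    """Return (vni, inner_frame) for a VXLAN packet, or None if it is not one.

    Assumes an untagged outer Ethernet header and an outer IPv4 header.
    """
    if len(frame) < 14 + 20 + 8 + 8:
        return None
    ethertype = struct.unpack_from("!H", frame, 12)[0]
    if ethertype != 0x0800:  # outer header must be IPv4
        return None
    ihl = (frame[14] & 0x0F) * 4  # outer IPv4 header length in bytes
    if ihl < 20 or frame[14 + 9] != 17:  # IP protocol 17 = UDP
        return None
    udp_off = 14 + ihl
    if len(frame) < udp_off + 8 + 8:
        return None
    dst_port = struct.unpack_from("!H", frame, udp_off + 2)[0]
    if dst_port != vxlan_port:
        return None
    vx_off = udp_off + 8
    if not frame[vx_off] & 0x08:  # I flag: the VNI field is valid
        return None
    vni = int.from_bytes(frame[vx_off + 4:vx_off + 7], "big")
    # Everything after the 8-byte VXLAN header is the original Ethernet frame.
    return vni, frame[vx_off + 8:]
```

The inner frame returned here carries the real host addresses, which is what an analysis tool needs to see instead of the underlay addresses.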
[29:19.860 --> 29:27.120]  And then the final one, which is the one I at least personally have used the most and dealt with the most, is what they call a tenant span.
[29:27.280 --> 29:30.080]  And that spans one or more endpoint groups.
[29:30.340 --> 29:35.340]  And it's sort of beyond the scope of this presentation to get in to what an endpoint group is.
[29:35.800 --> 29:40.560]  It's somewhat similar to a VLAN, though.
[29:41.540 --> 29:44.320]  And that's why it ends up being a pretty popular approach, right?
[29:44.320 --> 29:46.920]  Because it's a good way to target assets.
[29:46.920 --> 29:50.300]  And kind of nice in spine-leaf, it doesn't really matter.
[29:50.300 --> 29:54.760]  You can span... this endpoint group can be on all different leaf switches.
[29:54.900 --> 30:02.440]  And because it all uses ERSPAN, you can pull it all back to one spot and send it over to your analysis solution or your aggregation network.
[30:05.670 --> 30:16.950]  So once we get beyond just three-tier architectures and spine-leaf architectures, we actually end up having some additional challenges in enterprise deployments.
[30:16.950 --> 30:20.510]  And one of those is going to be virtual environments.
[30:21.230 --> 30:29.390]  The challenge in a virtual environment is that not all of the network traffic is going to hit a physical network.
[30:29.390 --> 30:48.010]  So the image on the screen, the three virtual hosts that are on host one, if they're talking to each other, that traffic is going to stay on host one, even though they're all hooked into the same virtual distributed switch.
[30:48.770 --> 30:56.790]  However, we always recommend do your physical spanning and tapping first.
[30:56.790 --> 31:06.150]  This is a problem that gets blown out of proportion. You often get 80 plus percent of the traffic that you're looking for.
[31:06.910 --> 31:19.170]  And it's funny we use 80% here because I would say, not only is it like 80% of the traffic you're looking for, I'd say about 80% of the time when people are worried about this, it ends up not being a problem.
[31:19.170 --> 31:37.270]  Pete already mentioned ERSPAN. Virtual distributed switches support ERSPAN, so for critical workloads you can target individual interfaces and get that traffic off of the hypervisor host.
[31:37.270 --> 31:45.450]  There's also virtual taps, which are installed on a host by host or VM by VM basis.
[31:45.450 --> 31:59.850]  You start running into the management overhead of installing those virtual taps, but for hypercritical pieces of infrastructure and pieces of applications, this can be a very viable solution.
[31:59.850 --> 32:15.010]  Another less obvious solution is to separate key VMs and make sure that they're on different hosts, forcing their communication to actually traverse a physical network that you already have monitoring in place for.
[32:16.190 --> 32:32.950]  And that last solution, by the way, sounds kind of funny. Hey, yeah, we'll just put them on separate hosts. But it turns out that the ease-of-use part is pretty important, right? Because some people are like, wait, that's all I have to do? And I'm like, yep, that's all you have to do. And they're like, oh, well, that's easy.
[32:33.910 --> 32:41.570]  So, you know, don't, you know, I think we might have mentioned earlier, don't let the perfect be the enemy of the good. And that's a good example of that.
[32:42.390 --> 32:59.670]  Alright, so let's talk a little bit about span aggregation, right? So what happens when you get these large environments is you don't just have to tap and span all over the place, you've got to find a way to bring all that traffic back somewhere, so you can feed it to one or more analysis and capture solutions.
[32:59.670 --> 33:19.670]  Right? So one thing to note about most of these analysis and capture solutions is they are not designed for port density. So a lot of them use expensive high end network cards, you know, that can cost, you know, over $10,000 each. So they're not going to have 16 of them sitting, you know, in the solution, right?
[33:19.670 --> 33:42.950]  So we need some way, you know, if I'm spanning 8 or 16 different switches, how do I get that fed into my analysis solution that has, say, you know, two or four 10 gig ports on it? So what's the solution? An aggregation switch. And if you're not familiar with aggregation switches, there are some big vendors that do it. So if you've heard of Gigamon or Apcon or HCM, they do it.
[33:43.610 --> 34:04.870]  Some switch vendors like Arista, you can also just configure their switches to be an aggregation switch. And really, when we say it's an aggregation switch, I mean, there are some that are purpose built, but it is a switch. Like the hardware, that's what it is. It's just a switch, except instead of sending traffic over it, you're going to use it to aggregate all your span and tap feeds, right?
[34:04.870 --> 34:31.470]  So essentially, what it does is, let's say I have that example where I have 8 switches spanned and I need to get it into 2 ports, it's just going to multiplex all that traffic from those 8 ingest ports and send it out of 2 of them to my analysis solution, right? And in fact, most of them can feed multiple analysis solutions. So maybe you have high speed packet capture, you've got an IDS, right? You've got some, you know, network detection and response solution, you want to feed all those.
[34:31.470 --> 34:43.590]  So nice thing is, you know, you just have one set of spans and taps, feed it into your aggregation switch, and then it can feed multiple solutions that are connected to it.
[34:43.630 --> 34:59.330]  And a couple things to add as well. So if you are thinking about taps at any sort of enterprise scale, you have to be thinking about span aggregation as well. Like, it is not an option to not consider that, not factor that into your deployment.
[34:59.330 --> 35:11.270]  Yeah, one thing we didn't mention about taps that maybe we should have, if you have a single 10 gig fiber tap, you might say, okay, I'm going to tap one link. So how many links do you actually have that feed your analysis solution?
[35:11.270 --> 35:21.590]  It's not one, it's actually two, because the way taps work is they feed you the ingress and the egress on that link, and they separate it out into two different links.
[35:21.590 --> 35:31.910]  And they do that because, as I mentioned before, it's a full duplex link, and taps want to do perfect preservation. They give you two outputs so that there is no packet loss at all possible.
[35:32.330 --> 35:39.850]  So Chris is exactly right. I mean, when you're talking taps, you can be talking a huge amount of links if you do them at any scale at all.
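To put rough numbers on that link count, here is a small sketch. The function names and the worst-case oversubscription measure are illustrative assumptions, not something from the talk: each tapped full-duplex link produces two monitor links, and the aggregation switch has to squeeze those into however many tool ports the analysis solution offers.

```python
def monitor_links_for_taps(tapped_links: int) -> int:
    """Each tapped full-duplex link yields two outputs, one per direction."""
    return tapped_links * 2


def worst_case_oversubscription(ingest_ports: int, ingest_gbps: float,
                                tool_ports: int, tool_gbps: float) -> float:
    """Ratio of total ingest line rate to total tool-port line rate.

    A value above 1.0 means the tool ports can be overrun if every ingest
    port runs at line rate, so filtering matters more as this number grows.
    """
    return (ingest_ports * ingest_gbps) / (tool_ports * tool_gbps)


# Eight tapped 10 gig links become sixteen monitor links.
links = monitor_links_for_taps(8)
print(links)  # 16

# Feeding those into two 10 gig tool ports is 8:1 at worst case line rate.
print(worst_case_oversubscription(links, 10, 2, 10))  # 8.0
```

Real links rarely run at line rate in both directions, so this is an upper bound for planning, not a prediction.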
[35:39.850 --> 35:48.310]  One additional thing, we've mentioned ERSPAN a couple of times. We've also mentioned VXLAN in the case of a spine-leaf deployment.
[35:48.990 --> 36:00.450]  SPAN aggregation solutions are capable a lot of times of doing decapsulation to get all of your traffic into whatever the base level traffic pattern that you want to analyze.
[36:00.450 --> 36:08.850]  So you want just the raw traffic that a host would see, for example. Great, fantastic. A lot of SPAN aggregation solutions can get it to that specific...
[36:08.850 --> 36:22.490]  And that's really important because if you feed, let's say spine-leaf encapsulated VXLAN traffic to an analysis solution and it's not decapsulated and that solution can't decapsulate it, it's not going to do anything for you.
[36:23.450 --> 36:32.450]  You're going to be like, what are these IP addresses? Because they're all going to be the underlay network and it's not going to be your actual host IP addresses.
[36:33.510 --> 36:44.910]  So some other things aggregation switches do, filtering traffic, they do deduplication, which we're going to talk about as well.
[36:44.910 --> 36:52.750]  So let's talk a little bit about filtering. So this is probably one of the most important things when you're building these networks out.
[36:52.750 --> 37:04.390]  So number one, packet analysis resources are not infinite, right? Nothing can accept infinite bandwidth. And writing packets to disks is also not infinite, right? You don't have an infinite disk space.
[37:04.390 --> 37:10.890]  In fact, if you have, let's say, 100 gigabits per second link, that will fill up one petabyte of disk per day approximately.
[37:11.590 --> 37:18.050]  And it's funny, when I meet with customers, they'll tell me, hey, I want 30 days of lookback. And I go, okay, what's your average throughput?
[37:18.050 --> 37:29.150]  And they might say something, I don't know, five gigabits, 10 gigabits, whatever it is, you can literally go on Google and type like 10 gigabits per second, if you type the words out, to terabytes per day.
[37:29.190 --> 37:36.830]  And then you can shock them and let them know like, oh, yeah, actually, you're going to need like 10 petabytes of disk if you want to keep that much data.
[37:37.150 --> 37:41.830]  And then of course, they're like, well, maybe we don't need to keep that.
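The lookback arithmetic is easy to check yourself. A minimal sketch, assuming decimal units (1 terabyte = 10^12 bytes) and ignoring capture file overhead, indexing, and compression; the function name is just for illustration.

```python
SECONDS_PER_DAY = 86_400


def capture_storage_bytes(avg_gbps: float, days: float) -> float:
    """Bytes of disk needed to retain `days` of traffic at a sustained average rate."""
    bytes_per_second = avg_gbps * 1e9 / 8
    return bytes_per_second * SECONDS_PER_DAY * days


# A sustained 100 gigabits per second for one day, in petabytes.
print(f"{capture_storage_bytes(100, 1) / 1e15:.2f} PB")  # 1.08 PB

# A sustained 10 gigabits per second for one day, in terabytes.
print(f"{capture_storage_bytes(10, 1) / 1e12:.0f} TB")  # 108 TB
```

Multiply the per-day figure by the lookback window to size the disk, and remember the input is average throughput, not link speed.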
[37:41.830 --> 37:48.950]  Some people can't dedicate whole data centers to storing PCAP. So it's just how much money you want to throw at the problem.
[37:49.450 --> 37:55.910]  Yeah, I mean, there's government agencies, right, that just have stacks and stacks and stacks of storage to do that.
[37:55.910 --> 38:04.290]  But, you know, even if some of the, you know, we've worked with some of the largest companies in the world, and, you know, they don't have infinite money, like, you know, any more than anyone else.
[38:05.070 --> 38:16.670]  So one of the things that can really help you, though, is filtering. Because there is a, you know, to use a non-technical term, if you're spanning a bunch of traffic, there's a good chance you're going to get some junk in there that you don't really care about.
[38:17.830 --> 38:26.510]  You know, things like backup traffic, data replication traffic, that stuff, it's just not typically going to be very useful to look at.
[38:26.510 --> 38:40.130]  And it can consume a massive amount of bandwidth and disk space. Like, I have customers where I'd say 30-40% of their traffic in a 24-hour period is backup traffic, if I were to span, you know, most of the traffic in their data center.
[38:40.230 --> 38:46.130]  So we're going to, if we get that in, we're going to filter it out. Some other good examples are going to be your VM infrastructure traffic.
[38:46.130 --> 38:57.610]  So if you use NFS, for example, to mount your disk that all your VMs use, you typically do not want to span that. It's going to be a massive amount of traffic. Or if you use iSCSI, same thing.
[38:57.910 --> 39:03.710]  vMotion traffic is another one. It's, you know, it's a ton of traffic, and it's just generally not going to be that interesting.
[39:03.970 --> 39:14.290]  And then finally, telemetry data can also consume a ton of bandwidth. So syslog, ELF, Elastic, NetFlow, etc.
[39:15.990 --> 39:31.470]  Splunk replication, one of the chattier things that I've seen on the network. There's certain network links where I've seen Splunk replication in particular consuming 30 to 40% of the total available bandwidth.
[39:31.470 --> 39:35.310]  So, filter it out. Get it out. I don't care.
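One way to express that kind of exclusion is a capture filter in pcap-filter syntax. The sketch below just builds the filter string; the port numbers are assumptions based on common defaults (NFS 2049, iSCSI 3260, syslog 514, NetFlow often 2055, vMotion 8000) and should be checked against your own environment. Aggregation switches have their own filter configuration, so treat this as an illustration of the idea, not of any vendor's syntax.

```python
# Commonly used default ports for bulk or telemetry traffic (assumed, verify locally).
NOISE_PORTS = {
    "nfs": 2049,
    "iscsi": 3260,
    "syslog": 514,
    "netflow": 2055,
    "vmotion": 8000,
}


def build_exclude_filter(ports) -> str:
    """Build a pcap-filter expression that drops traffic on the given ports."""
    clauses = " or ".join(f"port {p}" for p in sorted(set(ports)))
    return f"not ({clauses})" if clauses else ""


print(build_exclude_filter(NOISE_PORTS.values()))
# not (port 514 or port 2049 or port 2055 or port 3260 or port 8000)
```

The resulting string can be handed to any tool that accepts pcap-filter expressions, for example as a capture filter in tcpdump.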
[39:39.940 --> 39:50.420]  Alright, now, what's the other big thing besides filtering? Duplicate traffic, right? And that's, how do we handle duplicate traffic? Well, there's a couple ways. One way is deduplication.
[39:50.800 --> 40:03.260]  So, when you do this, if you haven't, you know, done a lot of spanning and tapping before, you're going to get duplicate traffic. It's normal. Don't freak out. Expect to see some of it, right?
[40:03.260 --> 40:20.220]  The problem is when you get a lot of it, right? There's negative impacts of having tons and tons of duplicate traffic. Number one, you know, it's like, okay, if I have a duplicate of every single packet on my network, and I don't have any way to deduplicate it, it looks like I have twice as much traffic. It's going to waste a ton of lookback space.
[40:22.040 --> 40:35.540]  Let's say I bought some packet analysis solution that can handle up to 10 gig of traffic. Well, if I'm feeding it two copies of every packet, I'm literally wasting half of its capability, right? And it can also make manual analysis more difficult.
[40:35.540 --> 40:46.120]  You know, if any of you have ever opened up a Wireshark trace, and there's like six copies of every packet, it can be annoying to look at. I mean, there's ways to filter it out and stuff, but it can just make it more of a pain.
[40:47.360 --> 40:58.720]  So a couple of solutions here. So one is going to be deduplication, which is a feature essentially that's supported by some aggregation switches and some packet analysis solutions.
[40:59.560 --> 41:11.740]  And my general rule of thumb is that it works pretty well for lower levels of duplicate traffic, right? As I said, it's going to be normal to see some amount of duplicate traffic, and deduplication can handle that pretty well.
[41:13.140 --> 41:27.120]  However, when you start getting a lot of duplicate traffic, sometimes these deduplication solutions just fall over. And it's not because they're poorly designed or bad or any of that. It's because it's a hard problem to solve that requires a lot of resources.
[41:27.120 --> 41:38.860]  If I'm feeding 10 gigabits of traffic, essentially the way they work is they have a buffer, they have to buffer the traffic, they have a window that they look at, where they look, you know, if they see there's various heuristics you can use to identify duplicate packets.
[41:38.900 --> 41:55.020]  But within that window, they try to identify them, right? But again, when you're talking 10 gig, 20 gig, 40 gig of traffic, it just requires a ton of memory and resources to buffer that and look any more than a few milliseconds for duplicates, right?
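The buffer-and-window idea can be sketched like this. It is a simplified illustration under stated assumptions: it hashes the whole packet, whereas real products typically hash selected header and payload fields so that copies differing only in MAC address or TTL still match; it also assumes timestamps arrive in non-decreasing order. The class name and the 5 millisecond default window are hypothetical.

```python
import hashlib
from collections import OrderedDict


class WindowedDeduplicator:
    """Flag packets whose exact bytes were already seen within a short time window."""

    def __init__(self, window_seconds: float = 0.005):
        self.window = window_seconds
        self._seen = OrderedDict()  # packet digest -> timestamp first seen

    def is_duplicate(self, timestamp: float, packet: bytes) -> bool:
        # Evict entries that have aged out of the window (oldest are first).
        while self._seen:
            oldest_ts = next(iter(self._seen.values()))
            if timestamp - oldest_ts <= self.window:
                break
            self._seen.popitem(last=False)
        digest = hashlib.sha1(packet).digest()
        if digest in self._seen:
            return True
        self._seen[digest] = timestamp
        return False
```

The memory cost is one entry per unique packet inside the window, which is why widening the window at tens of gigabits per second gets expensive quickly.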
[41:55.020 --> 42:08.960]  So what do you do when you have tons of duplicate traffic? Well, usually the reason is, is because you're spanning the same traffic or tapping the same traffic in a bunch of different places. And we're going to go back to the last slide, and we're going to filter that.
[42:08.960 --> 42:35.120]  So, in my experience, once you get above, you know, 5, 10, maybe 15%, I mean when you're up at like 50% duplicate and higher, you need to go filter. And you can filter where you're actually doing the span, or you can filter at the aggregation switch. And one of the things that I've found that works pretty well a lot of times is identifying, you know, as an example, let's say I have a firewall that's passive and I span both sides.
[42:35.120 --> 42:57.060]  Well, I'm going to get the same packet twice, right? Every time. If I want to filter that traffic out, essentially, all I have to do is figure out which MAC address is on which side of the firewall, right? And just filter the same traffic on both sides, you know, other than whatever gets blocked. And then I can use like a MAC address filter like that to pretty easily get rid of that traffic.
[42:57.060 --> 43:06.680]  So just as a general rule, I find that, you know, basing your filters, which by the way, most aggregation switches can do, basing your filters on MAC addresses is a pretty good approach.
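Here is one possible reading of that firewall example as code. It assumes a routed firewall that rewrites MAC addresses, so any frame whose source MAC is the firewall's own interface has already crossed the firewall and was captured on the other side. Dropping those frames keeps each packet once, at the point before it enters the firewall, which also preserves traffic the firewall blocks. The side names, MAC addresses, and function names are all hypothetical.

```python
def keep_frame(span_side: str, src_mac: str, firewall_macs: dict) -> bool:
    """Keep a frame unless it was sourced by the firewall interface on this side."""
    return src_mac.lower() != firewall_macs[span_side].lower()


def exclude_filter(span_side: str, firewall_macs: dict) -> str:
    """Equivalent pcap-filter expression for the span on one side."""
    return f"not ether src {firewall_macs[span_side].lower()}"


# Hypothetical firewall interface MACs, one per side.
FIREWALL_MACS = {
    "inside": "00:11:22:33:44:55",
    "outside": "00:11:22:33:44:66",
}

print(exclude_filter("outside", FIREWALL_MACS))
# not ether src 00:11:22:33:44:66
```

A transparent (layer 2) firewall does not rewrite MAC addresses, so this particular rule would not apply there and the filter would need to key on host MACs instead.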
[43:31.020 --> 43:59.680]  As much as I wish Amazon or Google would let me into their data centers and deploy taps and spans, I don't think they're so keen on doing that. And so what we've actually seen is the top three cloud vendors, AWS, GCP, and Azure have all announced, and in the case of AWS and GCP, made generally available span as a service.
[43:59.960 --> 44:27.060]  Which I think I made up. But it is the same thing as a span on prem, where you say, I have a specific set of cloud workloads, whether that's a specific instance, whether that's an ENI, whether that is an entire VPC, and you say, feed that into a specific monitoring interface, feed it into whatever analysis tool that I want to grab those packets from.
[44:27.060 --> 44:43.920]  In the case of Azure, they actually announced in 2018, it is still in private preview, just as a heads up. So for all the Azure customers out there, push on them. I know we are, because AWS and GCP can do it today.
[44:43.920 --> 45:06.700]  Yeah, so one thing I'll add here, that's pretty nice is, believe it or not, even though it's in the cloud, this stuff is actually quite a bit easier sometimes than on prem spans, because in AWS, you literally just say, okay, I have these, you know, 520 servers I want to span, or these 300 servers I want the traffic from, and you can just turn on the traffic mirroring for them.
[45:06.700 --> 45:20.340]  There's no figuring out which VLAN they're on, or any of the things you have to deal with in an on prem data center, right? It's just they handle the implementation details under the covers, you just say, okay, I'm going to pay some money and get my traffic.
[45:20.340 --> 45:37.660]  It's all API based, right? So you go get Boto, or pick your poison, and you say, give me all of these tagged instances, give me their data. You don't have to coordinate with network teams, you just get the data as soon as you want.
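A minimal sketch of that "give me all of these tagged instances" workflow with Boto3 might look like the following. The function name, tag values, and IDs are hypothetical; `describe_instances` and `create_traffic_mirror_session` are EC2 client calls, and the sketch assumes a traffic mirror target and filter already exist. Pagination and error handling are left out to keep it short.

```python
def mirror_tagged_instances(ec2, tag_key, tag_value, target_id, filter_id):
    """Create one traffic mirror session per network interface on tagged instances.

    `ec2` is expected to be a Boto3 EC2 client, e.g. boto3.client("ec2").
    Returns the list of created session IDs.
    """
    response = ec2.describe_instances(
        Filters=[{"Name": f"tag:{tag_key}", "Values": [tag_value]}]
    )
    session_ids = []
    for reservation in response["Reservations"]:
        for instance in reservation["Instances"]:
            for eni in instance.get("NetworkInterfaces", []):
                result = ec2.create_traffic_mirror_session(
                    NetworkInterfaceId=eni["NetworkInterfaceId"],
                    TrafficMirrorTargetId=target_id,
                    TrafficMirrorFilterId=filter_id,
                    SessionNumber=1,
                )
                session_ids.append(
                    result["TrafficMirrorSession"]["TrafficMirrorSessionId"]
                )
    return session_ids
```

Typical use would be along the lines of `mirror_tagged_instances(boto3.client("ec2"), "capture", "true", target_id, filter_id)`, where the tag and both IDs come from your own account.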
[45:40.080 --> 46:02.020]  Containers, you're running Kubernetes, you're running Docker, they have this similar challenge to VMs in that there is traffic that never hits a physical network, or it's encapsulated in such a way where when it hits the physical network, it's hard to tell what the actual data is.
[46:02.700 --> 46:23.160]  So first thing, because it's a similar challenge to VMs, you actually have a similar solution: tap the physical network and validate what you are and are not seeing. Again, in our experience, you actually see a decent amount of the critical traffic, especially the inter service traffic.
[46:23.160 --> 46:34.360]  What you won't necessarily see is inter pod traffic. But we as an industry are still trying to decide on what the best way to actually capture this traffic is.
[46:34.360 --> 46:47.820]  You can deploy a daemon set, which is a specific pod that you run on every single node within your K8s cluster. And so you can hook into the node networking and use that as a source of data.
[46:47.820 --> 47:16.180]  You can also run a sidecar, which is a container that is run in the pod, and use that to get inter pod, intra node traffic as well. And some of the technologies that we're considering are using the namespace, hooking into the docker0 interface, and actually tapping those internal localhost interfaces as a way of getting to some of that data.
[47:18.520 --> 47:37.440]  So with that, some of our takeaways, both Pete and I have been working with packets for a long time. And I think they're one of the arguably best sources of data, not just because of the richness of the data that's there, but because of how accessible they are.
[47:37.440 --> 47:45.740]  You're not having to go find every host that you care about on the network, and instead can target the network as a whole.
[47:47.140 --> 47:58.940]  I don't know about you, Pete, but it seems like every network is kind of the same, but they're all unique in their own fun and interesting ways.
[48:02.740 --> 48:15.240]  But that doesn't mean that you can't use general principles as you start designing these architectures, and then modify them, use different strategies for different problems.
[48:15.900 --> 48:20.360]  And that's, there's many different strategies, depending on the network.
[48:21.240 --> 48:32.380]  And I would say these last two points are probably the most important. That identify critical assets step is by far the most important thing that you can do.
[48:32.540 --> 48:41.220]  We can set aside all the technology implementation, all of that. Identifying critical assets, that's essentially your problem statement.
[48:41.220 --> 48:45.920]  And if you don't have a good problem statement, it's going to be hard to solve that problem.
[48:46.760 --> 48:53.640]  And I also think that don't let the perfect be the enemy of good is a really important point, too. And I'm going to go back to that log analogy.
[48:53.660 --> 49:03.880]  Again, this is very similar. No one is going to capture every single log that is ever written by every single host and device in a large environment.
[49:03.880 --> 49:09.240]  It's just not going to happen. And this is the same thing here. You are never going to capture every single packet.
[49:09.240 --> 49:12.260]  That doesn't mean that this can't be incredibly valuable.
[49:12.260 --> 49:20.220]  And what I've found is getting, let's say, 90% of what you want, it's not too hard.
[49:20.220 --> 49:24.580]  Then it's twice as hard to get the next 5%, to go to 95%.
[49:24.580 --> 49:30.280]  Then if you go to 97.5%, it's twice as hard as it was to get to 95%.
[49:30.280 --> 49:33.040]  And it's diminishing returns.
[49:33.560 --> 49:38.600]  So again, you want to identify those critical assets and don't let the perfect be the enemy of the good.
[49:38.600 --> 49:43.240]  As long as you have most of the stuff you need, most of the time,
[49:43.240 --> 49:50.120]  I think you can provide some incredibly valuable visibility and investigative capabilities to your organization.
[49:50.980 --> 49:55.500]  Perfect. So with that, thank you for your time.
[49:55.500 --> 49:58.520]  Thanks for coming to DEF CON Safe Mode.
[49:59.040 --> 50:01.820]  We're thankful to get the chance to speak to you guys.
[50:01.820 --> 50:05.900]  So we'll leave it open to any questions.
